Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup
Authors
Abstract
Knowledge distillation (KD) has emerged as an essential technique not only for model compression, but also for other learning tasks such as continual learning. Given the richer application spectrum and the potential online usage of KD, knowledge distillation efficiency becomes a pivotal component. In this work, we study this little-explored but important topic. Unlike previous works that focus solely on the accuracy of the student network, we attempt to achieve a harder goal: to obtain a performance comparable to conventional KD with a lower computation cost during knowledge transfer. To this end, we present UNcertainty-aware mIXup (UNIX), an effective approach that can reduce the transfer cost by 20% to 30% yet maintain a performance comparable to, or even better than, conventional KD. This is made possible via uncertainty sampling and a novel adaptive mixup that select informative samples dynamically over ample data and compact the knowledge in these samples. We show that our approach inherently performs hard sample mining. We further demonstrate its applicability to improve various existing KD approaches by reducing their queries to the teacher network. Extensive experiments are performed on CIFAR100 and ImageNet. Code is available at https://github.com/xuguodong03/UNIXKD.
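A minimal sketch of the idea, assuming the entropy of the student's prediction as the uncertainty score, a fixed keep ratio, and a fixed mixing weight (all hypothetical hyper-parameters; this is not the authors' released implementation from the repository above):

import torch
import torch.nn.functional as F

def uncertainty_aware_mixup_kd(student, teacher, x, keep_ratio=0.75, T=4.0):
    # One KD step in the spirit of UNIX: rank samples by student uncertainty,
    # mix the less informative samples into the more informative ones, and
    # query the teacher only on the reduced, mixed batch to save computation.
    with torch.no_grad():
        probs = F.softmax(student(x), dim=1)
        # Prediction entropy as an uncertainty proxy.
        uncertainty = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)

    order = uncertainty.argsort(descending=True)   # most uncertain first
    n_keep = max(1, int(keep_ratio * x.size(0)))
    informative, redundant = order[:n_keep], order[n_keep:]

    x_keep = x[informative].clone()
    if redundant.numel() > 0:
        # Mixup: blend a discarded sample into each kept one. The paper adapts
        # the mixing weight to uncertainty; 0.8 here is a fixed stand-in.
        lam = 0.8
        partners = redundant[torch.randint(redundant.numel(), (n_keep,))]
        x_keep = lam * x_keep + (1 - lam) * x[partners]

    with torch.no_grad():
        t_logits = teacher(x_keep)      # teacher queried on fewer samples
    s_logits = student(x_keep)
    return F.kl_div(F.log_softmax(s_logits / T, dim=1),
                    F.softmax(t_logits / T, dim=1),
                    reduction="batchmean") * (T * T)

In a training loop this loss would typically be combined with a supervised cross-entropy term, which is omitted here.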
Similar resources
Knowledge and Efficient Computation
We informally discuss "knowledge complexity": a measure for the amount of knowledge that can be feasibly extracted from a communication. Our measure provides an answer to the following two questions: 1) How much knowledge should be communicated for proving a theorem? 2) How to prove correctness of cryptographic protocols? We sympathize with the readers who are distressed by the level of informa...
Efficient SimRank Computation via Linearization
SimRank, proposed by Jeh and Widom, provides a good similarity measure that has been successfully used in numerous applications. While there are many algorithms proposed for computing SimRank, their computational costs are very high. In this paper, we propose a new computational technique, “SimRank linearization,” for computing SimRank, which converts the SimRank problem to a linear equation pr...
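For context, the SimRank measure that this linearization targets follows the standard recurrence (decay constant C and in-neighbor sets I(·) as in the usual definition; the notation is not taken from this excerpt):

s(a, b) =
\begin{cases}
1, & a = b,\\
\dfrac{C}{|I(a)|\,|I(b)|} \displaystyle\sum_{i \in I(a)} \sum_{j \in I(b)} s(i, j), & a \neq b,
\end{cases}

with s(a, b) = 0 whenever a ≠ b and I(a) or I(b) is empty.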
Efficient Knowledge Distillation from an Ensemble of Teachers
This paper describes the effectiveness of knowledge distillation using teacher student training for building accurate and compact neural networks. We show that with knowledge distillation, information from multiple acoustic models like very deep VGG networks and Long Short-Term Memory (LSTM) models can be used to train standard convolutional neural network (CNN) acoustic models for a variety of...
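A minimal sketch of one common way to distill from an ensemble of teachers, assuming the teachers' softened outputs are simply averaged (the temperature and the averaging scheme are assumptions; the referenced paper may combine or weight teachers differently):

import torch
import torch.nn.functional as F

def ensemble_distillation_loss(student_logits, teacher_models, x, T=2.0):
    # Average the temperature-softened predictions of all teachers and use
    # the result as the soft target distribution for the student.
    with torch.no_grad():
        teacher_probs = torch.stack(
            [F.softmax(t(x) / T, dim=1) for t in teacher_models]
        ).mean(dim=0)
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    teacher_probs,
                    reduction="batchmean") * (T * T)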
Learning Efficient Object Detection Models with Knowledge Distillation
Despite significant accuracy improvements in convolutional neural network (CNN)-based object detectors, they often require prohibitive runtimes to process an image for real-time applications. State-of-the-art models often use very deep networks with a large number of floating point operations. Efforts such as model compression learn compact models with fewer parameters, but with much ...
Efficient Graph Edit Distance Computation and Verification via Anchor-aware Lower Bound Estimation
Graph edit distance (GED) is an important similarity measure adopted in a similarity-based analysis between two graphs, and computing GED is a primitive operator in graph database analysis. Partially due to the NP-hardness, the existing techniques for computing GED are only able to process very small graphs with less than 30 vertices. Motivated by this, in this paper we systematically study the...
Journal
Journal title: Pattern Recognition
Year: 2023
ISSN: 1873-5142, 0031-3203
DOI: https://doi.org/10.1016/j.patcog.2023.109338